AITopics | lstm cell

From Bayesian Sparsity to Gated Recurrent Nets

Hao He, Bo Xin, Satoshi Ikehata, David Wipf

Neural Information Processing Systems, Nov-21-2025, 13:31:28 GMT

However, the Achilles' heel of all these approaches is that

artificial intelligence, iteration, machine learning, (19 more...)

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Massachusetts (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Dense Video Captioning using Graph-based Sentence Summarization

Zhang, Zhiwang, Xu, Dong, Ouyang, Wanli, Zhou, Luping

arXiv.org Artificial Intelligence, Jun-26-2025

Recently, dense video captioning has made attractive progress in detecting and captioning all events in a long untrimmed video. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a graph-based partition-and-summarization (GPaS) framework for dense video captioning in two stages. For the "partition" stage, a whole event proposal is split into short video segments for captioning at a finer level. For the "summarization" stage, the generated sentences carrying rich description information for each segment are summarized into one sentence to describe the whole event. We particularly focus on the "summarization" stage, and propose a framework that effectively exploits the relationship between semantic words for summarization. We achieve this goal by treating semantic words as nodes in a graph and learning their interactions by coupling Graph Convolutional Network (GCN) and Long Short Term Memory (LSTM), with the aid of visual cues. Two schemes of GCN-LSTM Interaction (GLI) modules are proposed for seamless integration of GCN and LSTM. The effectiveness of our approach is demonstrated via an extensive comparison with state-of-the-art methods on two benchmarks, the ActivityNet Captions dataset and the YouCook II dataset. Dense video captioning, which aims at detecting all events and giving language descriptions in an untrimmed long video, is a very challenging problem in computer vision and has attracted a lot of research attention recently. This task consists of two sub-tasks: 1) temporal proposal generation to localize the events and 2) video captioning to describe the events.

artificial intelligence, machine learning, module, (14 more...)

2506.20583

Country:

Oceania > Australia > New South Wales > Sydney (0.14)
Asia > China > Hong Kong (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Education (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Applied Machine Learning Methods with Long-Short Term Memory Based Recurrent Neural Networks for Multivariate Temperature Prediction

Lukić, Bojan

arXiv.org Artificial Intelligence, Mar-8-2025

This paper gives an overview of how to develop a dense and deep neural network for making a time series prediction. First, the history and cornerstones of Artificial Intelligence and Machine Learning are presented. After a short introduction to the theory of Artificial Intelligence and Machine Learning, the paper goes deeper into the techniques for conducting a time series prediction with different models of neural networks. For this project, Python's development environment Jupyter, extended with the TensorFlow package and the deep-learning application Keras, is used. The system setup and project framework are explained in more detail before discussing the time series prediction. The main part shows an applied example of time series prediction with weather data. For this work, a deep recurrent neural network with Long Short-Term Memory cells is used to conduct the time series prediction. The results and evaluation of the work show that a weather prediction with deep neural networks can be successful for a short time period. However, there are some drawbacks and limitations with time series prediction, which are discussed towards the end of the paper.
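As a rough illustration of the data preparation that usually precedes this kind of LSTM forecasting, the sketch below slices a multivariate series into input windows and future targets. It is plain Python rather than the TensorFlow/Keras setup the abstract mentions, and the function name `make_windows`, the window lengths, and the toy readings are all assumptions made for this example, not details from the paper.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]  # one time step of multivariate readings


def make_windows(
    series: Sequence[Row], n_in: int, n_out: int, target_col: int
) -> Tuple[List[List[Row]], List[List[float]]]:
    """Split a multivariate series into (input window, future target) pairs."""
    inputs: List[List[Row]] = []
    targets: List[List[float]] = []
    last_start = len(series) - n_in - n_out
    for start in range(last_start + 1):
        # n_in consecutive time steps form the model input ...
        window = [series[t] for t in range(start, start + n_in)]
        # ... and the next n_out values of one column form the target.
        future = [
            series[t][target_col]
            for t in range(start + n_in, start + n_in + n_out)
        ]
        inputs.append(window)
        targets.append(future)
    return inputs, targets


# Toy data (made up): (temperature in deg C, pressure in hPa) per time step.
readings = [
    (10.0, 1010.0),
    (11.0, 1009.0),
    (12.5, 1008.0),
    (12.0, 1008.5),
    (11.5, 1009.5),
]
X, y = make_windows(readings, n_in=3, n_out=1, target_col=0)
print(len(X), y)
```

Each element of `X` would be one input sequence for a recurrent model, and the matching element of `y` is the temperature it should predict.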

prediction, series prediction, time series prediction, (16 more...)

2503.06278

Country:

South America > Chile (0.04)
South America > Brazil (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(7 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Differential Machine Learning for Time Series Prediction

Yadav, Akash, Nualart, Eulalia

arXiv.org Artificial Intelligence, Mar-8-2025

Accurate time series prediction is challenging due to the inherent nonlinearity and sensitivity to initial conditions. We propose a novel approach that enhances neural network predictions through differential learning, which involves training models on both the original time series and its differential series. Specifically, we develop a differential long short-term memory (Diff-LSTM) network that uses a shared LSTM cell to simultaneously process both data streams, effectively capturing intrinsic patterns and temporal dynamics. Evaluated on the Mackey-Glass, Lorenz, and Rössler chaotic time series, as well as a real-world financial dataset from ACI Worldwide Inc., our results demonstrate that the Diff-LSTM network outperforms prevalent models such as recurrent neural networks, convolutional neural networks, and bidirectional and encoder-decoder LSTM networks in both short-term and long-term predictions. This framework offers a promising solution for enhancing time series prediction, even when comprehensive knowledge of the underlying dynamics of the time series is not fully available.
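To make the idea of one shared LSTM cell reading both a series and its differential series more concrete, here is a minimal scalar sketch in plain Python. The gate equations are the standard LSTM ones; everything else (taking the differential series to be the first difference, the weight names, the fixed weight values, and running the two streams separately) is an assumption for illustration. The abstract does not say how Diff-LSTM combines the two streams, so that step is left out.

```python
import math
from typing import Dict, List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(
    x: float, h_prev: float, c_prev: float, w: Dict[str, float]
) -> Tuple[float, float]:
    """One step of a scalar LSTM cell using the standard gate equations."""
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    c = f * c_prev + i * g      # new cell state
    h = o * math.tanh(c)        # new hidden state
    return h, c


def first_difference(series: List[float]) -> List[float]:
    """Differences between consecutive values (assumed form of the differential series)."""
    return [b - a for a, b in zip(series, series[1:])]


def run_shared_cell(series: List[float], w: Dict[str, float]) -> float:
    """Feed a series through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in series:
        h, c = lstm_cell_step(x, h, c, w)
    return h


# Hypothetical fixed weights; a trained model would learn these.
weight_names = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")
weights = {name: 0.5 for name in weight_names}

series = [0.2, 0.5, 0.9, 0.4]
h_series = run_shared_cell(series, weights)                  # original stream
h_diff = run_shared_cell(first_difference(series), weights)  # differential stream
print(h_series, h_diff)
```

Both calls use the same `weights` dictionary, which is what "shared" means in this sketch: the two streams are summarized by one set of cell parameters.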

architecture, prediction, time series, (15 more...)

2503.03302

Country:

Oceania > Australia (0.14)
Europe > Spain (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre:

Research Report > New Finding (0.87)
Research Report > Promising Solution (0.54)

Industry: Banking & Finance > Trading (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reviews: Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks

Neural Information Processing Systems, Jan-26-2025, 17:36:24 GMT

The paper starts by showing empirically and theoretically that saliency maps generated using gradients vanish over long sequences in LSTMs. The authors propose a modification to the LSTM cell, called LSTM with cell-attention, which can attend to all previous time steps. They show that this approach considerably improves the saliency on the input sequence. They also test their approach on the fMRI dataset of the Human Connectome Project (HCP). Originality: The proposed LSTM with cell-attention is a novel combination of well-known techniques.

dataset, synthetic dataset, time step, (9 more...)

Industry: Health & Medicine (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Seq2Seq Model-Based Chatbot with LSTM and Attention Mechanism for Enhanced User Interaction

Benaddi, Lamya, Ouaddi, Charaf, Souha, Adnane, Jakimi, Abdeslam, Rahouti, Mohamed, Aledhari, Mohammed, Oliveira, Diogo, Ouchao, Brahim

arXiv.org Artificial Intelligence, Dec-27-2024

A chatbot is an intelligent software application that automates conversations and engages users in natural language through messaging platforms. Leveraging artificial intelligence (AI), chatbots serve various functions, including customer service, information gathering, and casual conversation. Existing virtual assistant chatbots, such as ChatGPT and Gemini, demonstrate the potential of AI in Natural Language Processing (NLP). However, many current solutions rely on predefined APIs, which can result in vendor lock-in and high costs. To address these challenges, this work proposes a chatbot developed using a Sequence-to-Sequence (Seq2Seq) model with an encoder-decoder architecture that incorporates attention mechanisms and Long Short-Term Memory (LSTM) cells. By avoiding predefined APIs, this approach ensures flexibility and cost-effectiveness. The chatbot is trained, validated, and tested on a dataset specifically curated for the tourism sector in Draa-Tafilalet, Morocco. Key evaluation findings indicate that the proposed Seq2Seq model-based chatbot achieved high accuracies: approximately 99.58% in training, 98.03% in validation, and 94.12% in testing. These results demonstrate the chatbot's effectiveness in providing relevant and coherent responses within the tourism domain, highlighting the potential of specialized AI applications to enhance user experience and satisfaction in niche markets.

artificial intelligence, machine learning, natural language, (18 more...)

2501.00049

Country:

North America > United States > Texas (0.46)
Africa > Middle East > Morocco > Drâa-Tafilalet Region (0.26)

Genre: Research Report > New Finding (0.66)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Probing for Consciousness in Machines

Immertreu, Mathis, Schilling, Achim, Maier, Andreas, Krauss, Patrick

arXiv.org Artificial Intelligence, Nov-25-2024

This study explores the potential for artificial agents to develop core consciousness, as proposed by Antonio Damasio's theory of consciousness. According to Damasio, the emergence of core consciousness relies on the integration of a self model, informed by representations of emotions and feelings, and a world model. We hypothesize that an artificial agent, trained via reinforcement learning (RL) in a virtual environment, can develop preliminary forms of these models as a byproduct of its primary task. The agent's main objective is to learn to play a video game and explore the environment. To evaluate the emergence of world and self models, we employ probes: feedforward classifiers that use the activations of the trained agent's neural networks to predict the spatial positions of the agent itself. Our results demonstrate that the agent can form rudimentary world and self models, suggesting a pathway toward developing machine consciousness. This research provides foundational insights into the capabilities of artificial agents in mirroring aspects of human consciousness, with implications for future advancements in artificial intelligence.
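A probe of the kind described above can be pictured as a small classifier trained on recorded activations. The sketch below uses the simplest feedforward classifier, a single logistic layer, trained by gradient descent in plain Python. The activation vectors, the binary label ("agent is in the left half of the map"), and the hyperparameters are all invented for this example; the abstract does not give the probe architecture or the position encoding used in the study.

```python
import math
from typing import List, Sequence, Tuple


def train_linear_probe(
    activations: Sequence[Sequence[float]],
    labels: Sequence[int],
    lr: float = 0.5,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit a logistic-regression probe with per-example gradient steps."""
    weights = [0.0] * len(activations[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(activations, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of label 1
            err = p - y                      # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def probe_predict(weights: Sequence[float], bias: float, x: Sequence[float]) -> int:
    """Return 1 if the probe's score is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


# Made-up hidden activations and position labels (1 = left half, 0 = right half).
acts = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.7)]
pos = [1, 1, 0, 0]

w, b = train_linear_probe(acts, pos)
print([probe_predict(w, b, a) for a in acts])
```

If a probe like this predicts position well on held-out activations, that is taken as evidence that the network encodes the information; the agent itself is never retrained.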

machine learning, natural language, reinforcement learning, (17 more...)

2411.16262

Country:

Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (0.93)
Leisure & Entertainment (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues (1.00)
(3 more...)

From Bayesian Sparsity to Gated Recurrent Nets

Hao He, Bo Xin, Satoshi Ikehata, David Wipf

Neural Information Processing Systems, Oct-4-2024, 09:08:30 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, iteration, sbl iteration, (15 more...)

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Enhancing Energy-efficiency by Solving the Throughput Bottleneck of LSTM Cells for Embedded FPGAs

Qian, Chao, Ling, Tianheng, Schiele, Gregor

arXiv.org Artificial Intelligence, Nov-25-2023

To process sensor data in the Internet of Things (IoT), embedded deep learning for 1-dimensional data is an important technique. In the past, CNNs were frequently used because they are simple to optimise for special embedded hardware such as FPGAs. This work proposes a novel LSTM cell optimisation aimed at energy-efficient inference on end devices. Using traffic speed prediction as a case study, a vanilla LSTM model with the optimised LSTM cell achieves 17534 inferences per second while consuming only 3.8 $\mu$J per inference on the FPGA XC7S15 from the Spartan-7 family. It achieves at least 5.4$\times$ higher throughput and is 1.37$\times$ more energy efficient than existing approaches.
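The two figures quoted above can be combined into an implied average power draw, which is often easier to compare against an energy budget. The calculation below is a back-of-the-envelope sketch: it assumes the device runs continuously at the reported throughput and that inferences execute one after another. Neither the power value nor the per-inference time is stated in the abstract.

```python
inferences_per_second = 17534   # throughput reported in the abstract
energy_per_inference_uj = 3.8   # microjoules per inference, from the abstract

# Average power at full throughput: (inferences/s) * (J/inference) = W.
# 1e-6 converts microjoules to joules, 1e3 converts watts to milliwatts.
power_mw = inferences_per_second * energy_per_inference_uj * 1e-6 * 1e3

# Time per inference, assuming strictly sequential execution (an assumption).
time_per_inference_us = 1e6 / inferences_per_second

print(f"{power_mw:.1f} mW, {time_per_inference_us:.1f} us per inference")
# prints "66.6 mW, 57.0 us per inference"
```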

fpga, lstm cell, lstm model, (14 more...)

doi: 10.1007/978-3-031-23618-1_40

2310.16842

Country: Europe > Germany (0.04)

Genre: Research Report (0.50)

Industry: Semiconductors & Electronics (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RigLSTM: Recurrent Independent Grid LSTM for Generalizable Sequence Learning

Wang, Ziyu, Jiang, Wenhao, Zhang, Zixuan, Tang, Wei, Yan, Junchi

arXiv.org Artificial Intelligence, Nov-3-2023

Sequential processes in the real world often carry a combination of simple subsystems that interact with each other in certain forms. Learning such a modular structure can often improve the robustness against environmental changes. In this paper, we propose recurrent independent Grid LSTM (RigLSTM), composed of a group of independent LSTM cells that cooperate with each other, for exploiting the underlying modular structure of the target task. Building on the recent Grid LSTM, our model adopts cell selection, input feature selection, hidden state selection, and soft state updating to achieve better generalization on tasks where some factors differ between training and evaluation. Specifically, at each time step, only a fraction of cells are activated, and the activated cells select relevant inputs and cells to communicate with. At the end of one time step, the hidden states of the activated cells are updated by considering the relevance between the inputs and the hidden states from the last and current time steps. Extensive experiments on diversified sequential modeling tasks are conducted to show the superior generalization ability when there exist changes in the testing environment. A certain patterns and characterizing real-world dynamic processes, such as component is corresponding to a certain part of the environment. Therefore, models adopt such reinforcement learning for intelligent agents [11], [12].
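The cell-selection and soft-update steps mentioned in the abstract can be pictured with a small sketch. The code below picks the `k` highest-scoring cells and blends old and new hidden states with a gate value. The relevance scores, the top-k rule, and the convex-blend update are hypothetical stand-ins chosen for illustration; the abstract does not specify how RigLSTM computes relevance or performs its update.

```python
from typing import List, Sequence


def select_active_cells(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k cells with the highest relevance scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def soft_update(
    h_prev: Sequence[float], h_new: Sequence[float], gate: float
) -> List[float]:
    """Blend previous and candidate hidden states; gate=1 keeps only the new state."""
    return [(1.0 - gate) * a + gate * b for a, b in zip(h_prev, h_new)]


# Hypothetical relevance scores for four independent cells at one time step.
scores = [0.1, 0.9, 0.4, 0.7]
active = select_active_cells(scores, k=2)
print(active)  # prints [1, 3]

# Only active cells would be updated; inactive cells keep their previous state.
print(soft_update([0.0, 1.0], [1.0, 0.0], gate=0.25))  # prints [0.25, 0.75]
```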

information, riglstm, selection, (15 more...)

2311.02123

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Illinois (0.04)
North America > Canada > British Columbia (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)